In most situations it is infeasible to have new data on hand to test a
model. An in-house test therefore uses part of the available data
to mimic a real-world, out-of-house test on new data. This part
of the available data is reserved from the whole data set
and is called a testing data set. In contrast, the data used to construct a
model is called a training data set. The final evaluation of a
supervised machine learning model must be based on its performance in
a generalisation test applied to a testing data set.
A common exercise for the generalisation test of a supervised machine
learning model is to randomly divide a collected data set into two parts.
One of them is used as the training data set and the other is reserved as the
testing data set. However, it can be difficult to guarantee a robust division
of the available data that delivers an unbiased generalisation
performance test. Robustness here means that such a random
division generates a testing data set which represents the data
distribution of the training data set. This is an important point to address
and is the key to a successful generalisation test process.
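The random division described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name random_split, the 30% testing fraction, the seeds and the synthetic Gaussian data are all hypothetical and do not come from the text. Comparing a summary statistic of the two parts gives a rough first indication of whether the testing data set resembles the training data set.

```python
import random


def random_split(data, test_fraction=0.3, seed=0):
    """Shuffle a copy of `data` and split it into (training, testing) lists.

    `test_fraction` and `seed` are illustrative defaults, not values
    taken from the text.
    """
    rng = random.Random(seed)
    shuffled = list(data)          # copy, so the caller's data is untouched
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


# Hypothetical one-dimensional data set, used for illustration only.
rng = random.Random(42)
data = [rng.gauss(0.0, 1.0) for _ in range(1000)]

training, testing = random_split(data, test_fraction=0.3, seed=1)

# A crude representativeness check: the two means should be close if the
# testing data set reflects the distribution of the training data set.
print(len(training), len(testing))
print(abs(mean(training) - mean(testing)))
```

A single summary statistic is a weak check. Repeating the split with different seeds shows how much the gap between the two parts varies from one random division to another.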
Three approaches have been used so far for the generalisation test of a
supervised machine learning model: the random sampling
approach, the K-fold cross-validation approach [Devijver and Kittler,
shop, 1996] and the Jackknife test [Quenouille, 1949; Efron and
81].
Random sampling is an application of Monte Carlo simulation
[Metropolis, 1987]. The basic principle of Monte Carlo simulation is
to draw samples from a population many times to estimate a parameter. It
assumes that as the number of samples increases, the estimate of the
parameter approaches the true value. Therefore, a parameter can be well
estimated if a sufficient number of samples can be guaranteed. To demonstrate
how this works, a one-dimensional Gaussian distribution centred at zero
was generated with one million data points. Samples were then
drawn from this population, with their number varying from ten to 10,000, to
estimate the mean of the Gaussian distribution. Figure 3.13 shows the
simulation result. It can be seen that the estimated mean of the Gaussian
distribution gradually converged to the true mean, i.e., zero.
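The experiment above can be approximated with the following sketch, which is not the code used to produce Figure 3.13. It assumes a unit standard deviation, which the text does not state, and reads "a varying number from ten to 10,000" as the number of points drawn. The function name estimate_means, the particular sample sizes and the seeds are likewise illustrative choices.

```python
import random
import statistics


def estimate_means(population, sample_sizes, seed=0):
    """Estimate the population mean from random samples of several sizes.

    Returns a dict mapping each sample size to the mean of one random
    sample (drawn without replacement) of that size.
    """
    rng = random.Random(seed)
    return {n: statistics.fmean(rng.sample(population, n))
            for n in sample_sizes}


# Population: one million points from a Gaussian centred at zero.
# The standard deviation of 1.0 is an assumption.
rng = random.Random(0)
population = [rng.gauss(0.0, 1.0) for _ in range(1_000_000)]

# Sample sizes spanning the range mentioned in the text.
sizes = [10, 100, 1000, 10_000]
estimates = estimate_means(population, sizes, seed=1)

for n, m in estimates.items():
    print(n, round(m, 4))
```

With a unit standard deviation, the standard error of the estimated mean is about 1/sqrt(n), so the estimates typically tighten around zero as n grows. Any single run can still fluctuate, and the error need not shrink strictly from one sample size to the next.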